Skip to content

feat(scaling): engine.io WS transport-level packing — lever 8 prototype#7772

Closed
JohnMcLear wants to merge 1 commit into
developfrom
feat/engine-io-ws-packing
Closed

feat(scaling): engine.io WS transport-level packing — lever 8 prototype#7772
JohnMcLear wants to merge 1 commit into
developfrom
feat/engine-io-ws-packing

Conversation

@JohnMcLear
Copy link
Copy Markdown
Member

Summary

Prototype for the engine.io transport-level packing direction from #7767 — the meatiest untouched lever in the scaling-dive doc.

What changes. engine.io's WebSocket transport sends one WS frame per engine.io packet, even when the engine.io socket has multiple packets buffered. The polling transport doesn't — its `transport.send(packets)` runs through `encodePayload` and writes one HTTP response containing the whole payload. This PR monkey-patches engine.io's WS transport so multi-packet flushes go out as ONE frame containing the payload-encoded bundle. Single-packet sends keep the legacy fast path (including the pre-encoded-frame optimisation), so steady-state quiet-pad behaviour is identical to upstream.

Why. At high emit rate the WS path is dominated by per-frame syscalls on the server and per-message callback overhead on the client. The dive measured this directly via lever 4 (`socketTransportProtocols: ["websocket"]`): dropping the polling fallback at 200 authors quadrupled client p95 and nearly tripled `apply_mean`. The polling fallback was acting as a natural coalescer; this PR brings the same coalescing property to the WS path.

Implementation

  • New module `src/node/utils/EnginePacking.ts` — installs the patch at boot. Idempotent. Resolves engine.io's own `engine.io-parser` instance so we use the exact `encodePayload` the transport already imports.
  • `src/node/hooks/express/socketio.ts` — calls the installer once, BEFORE constructing the socket.io `Server`, so the patched prototype is in effect when transports are created.
  • `src/node/utils/Settings.ts` — `enginePacking: false` default, documented warning about client compatibility.

Rollout warning

Enabling this on a server with clients that haven't been updated will silently break those clients: a multi-packet frame fed to engine.io-client's stock `decodePacket` returns garbage. The harness-side patch in ether/etherpad-load-test#TBD is forward-compatible (detects `\x1e` and routes through `decodePayload`). For production rollout the etherpad browser bundle would need the same forward-compat patch — out of scope for this prototype.

Scoring this

Once merged (or via `core_ref=feat/engine-io-ws-packing`):

```
gh workflow run "Scaling dive" --repo ether/etherpad-load-test --ref main \
-f core_ref=feat/engine-io-ws-packing \
-f sweep='authors=100..500:step=50:dwell=8s:warmup=2s'
```

The dive workflow includes a new `engine-packing` matrix entry that sets `enginePacking: true` in `settings.json`. Expectations:

  • `emits_NEW_CHANGES` count unchanged (we still emit N application-level messages).
  • WS frame count on the wire reduced proportionally to packets-per-flush.
  • Server CPU and `apply_mean` drop at the cliff, since per-frame syscall overhead drops.
  • Client-side perceived latency improves if the per-message callback overhead was a significant chunk of the wait.

Files touched

  • `src/node/utils/EnginePacking.ts` (new, ~50 lines incl. comments)
  • `src/node/hooks/express/socketio.ts` (+10 lines, conditional require)
  • `src/node/utils/Settings.ts` (+1 type field, +14 documented default)

No tests yet — the patch is best validated end-to-end via the dive workflow. A unit test that constructs an in-memory WebSocket transport and asserts the framing is feasible but high effort for a prototype; happy to add it before merge if reviewers want.

🤖 Generated with Claude Code

Issue: engine.io's WebSocket transport sends one WS frame per
engine.io packet, even when the engine.io socket has multiple packets
buffered (see #7767). The polling transport already
coalesces — `transport.send(packets)` goes through
`encodePayload(packets)` and writes one HTTP response containing
the whole payload. At high emit rate the WS path is dominated by
per-frame syscalls on the server and per-message callback overhead
on the client; that's why dropping the polling fallback (lever 4)
made things sharply worse.

This patch monkey-patches engine.io's WebSocket transport prototype
so `send(packets)` with N > 1 packets goes through `encodePayload`
and emits ONE WS frame containing the multi-packet payload — the
same wire bytes the polling transport already uses. Single-packet
sends keep the legacy fast path including the pre-encoded-frame
optimisation, so steady-state quiet-pad behaviour is identical to
upstream.

Gated by settings.enginePacking (default false). Receiving clients
must recognise payload-encoded frames (split on `\x1e` and
decodePayload). The etherpad-cli-client patch in
ether/etherpad-load-test#TBD is forward-compatible; production
deployments enabling the flag MUST also ship a forward-compatible
browser bundle.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@qodo-code-review
Copy link
Copy Markdown

Qodo reviews are paused for this user.

Troubleshooting steps vary by plan Learn more →

On a Teams plan?
Reviews resume once this user has a paid seat and their Git account is linked in Qodo.
Link Git account →

Using GitHub Enterprise Server, GitLab Self-Managed, or Bitbucket Data Center?
These require an Enterprise plan - Contact us
Contact us →

@qodo-free-for-open-source-projects
Copy link
Copy Markdown

Review Summary by Qodo

Implement engine.io WebSocket transport-level packet packing (lever 8)

✨ Enhancement

Grey Divider

Walkthroughs

Description
• Implement engine.io WebSocket transport-level packet packing to reduce frame overhead
• Monkey-patch WebSocket transport to coalesce multi-packet sends into single frames
• Add enginePacking setting (default false) with forward-compatibility warnings
• Maintain single-packet fast path for steady-state quiet-pad behavior
Diagram
flowchart LR
  A["Multiple buffered<br/>engine.io packets"] -->|"enginePacking: true"| B["encodePayload<br/>coalescing"]
  B -->|"Single WS frame"| C["Reduced per-frame<br/>syscall overhead"]
  A -->|"Single packet or<br/>enginePacking: false"| D["Legacy fast path<br/>per-packet frames"]
  D --> E["Unchanged behavior"]
Loading

Grey Divider

File Changes

1. src/node/utils/EnginePacking.ts ✨ Enhancement +86/-0

New engine.io WebSocket packing patch module

• New module implementing WebSocket transport monkey-patch for packet coalescing
• Resolves engine.io's encodePayload and patches WebSocketTransport.prototype.send
• Multi-packet sends (N > 1) go through encodePayload and emit single WS frame
• Single-packet sends preserve legacy fast path with pre-encoded-frame optimization
• Includes idempotent installation guard and error handling with logging

src/node/utils/EnginePacking.ts


2. src/node/hooks/express/socketio.ts ✨ Enhancement +10/-0

Conditionally install engine packing at server boot

• Add conditional lazy-loading of EnginePacking module before socket.io Server construction
• Call installEngineWsPacking() only when settings.enginePacking === true
• Ensures patched transport prototype is active when engine.io instantiates transports
• Lazy require prevents module load for deployments not using the feature

src/node/hooks/express/socketio.ts


3. src/node/utils/Settings.ts ⚙️ Configuration changes +18/-0

Add enginePacking setting with forward-compatibility warnings

• Add enginePacking: boolean field to SettingsType type definition
• Add enginePacking: false default setting with comprehensive documentation
• Document warning about client forward-compatibility requirements
• Explain that multi-packet frames require clients to detect \x1e separator and use
 decodePayload
• Note that production deployments must coordinate rollouts with compatible browser bundles

src/node/utils/Settings.ts


Grey Divider

Qodo Logo

@qodo-free-for-open-source-projects
Copy link
Copy Markdown

qodo-free-for-open-source-projects Bot commented May 16, 2026

Code Review by Qodo

🐞 Bugs (2) 📘 Rule violations (1) 📎 Requirement gaps (0)

Context used

Grey Divider


Action required

1. enginePacking undocumented in template 📘 Rule violation ⚙ Maintainability
Description
A new public config setting enginePacking is added and wired into server boot, but the user-facing
configuration documentation/template is not updated to include it. This risks operators being
unaware of the flag and its compatibility warning.
Code

src/node/utils/Settings.ts[R662-678]

+  /**
+   * Pack multiple engine.io packets into a single WebSocket frame
+   * (ether/etherpad#7756 lever 8). engine.io's WS transport otherwise
+   * sends one frame per packet, while the polling transport already
+   * batches via encodePayload. Enabling this matches the polling
+   * coalescing behaviour on the WS path; at high fan-out rates it cuts
+   * the WS frame count proportionally to packets-per-flush.
+   *
+   * WARNING: enabling this requires connected clients to recognise
+   * payload-encoded WS frames (split on the `\x1e` record separator
+   * and decodePayload). Clients that pass each frame straight to
+   * decodePacket will fail to parse a multi-packet frame and silently
+   * miss those messages. The etherpad browser bundle and
+   * etherpad-cli-client are forward-compatible (they detect the
+   * separator); third-party clients may not be. Coordinate rollouts.
+   */
+  enginePacking: false,
Evidence
The checklist requires documentation updates when adding/changing configuration. The PR adds the new
enginePacking setting (with warnings) in code, but the configuration template that documents
settings.json options does not include any enginePacking entry in the relevant socket.io/socket
transport section.

src/node/utils/Settings.ts[662-678]
settings.json.template[709-740]
Best Practice: Repository guidelines

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
A new configuration option (`enginePacking`) was introduced and is used at runtime, but it is not documented in the repository’s configuration documentation, so users will not discover it or its rollout warnings.

## Issue Context
`enginePacking` is added to `SettingsType` and defaulted to `false` with detailed inline comments, and it gates a runtime behavior change in socket.io/engine.io startup.

## Fix Focus Areas
- settings.json.template[709-740]
- src/node/utils/Settings.ts[662-678]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools



Remediation recommended

2. Installer blocks retry 🐞 Bug ☼ Reliability
Description
installEngineWsPacking() sets installed = true before verifying engine.io modules/shapes, so any
early return (missing modules or unexpected shape) prevents future install attempts in the same
process even though the patch was never applied.
Code

src/node/utils/EnginePacking.ts[R38-58]

+  if (installed) return;
+  installed = true;
+
+  let WebSocketTransport: any;
+  let encodePayload: any;
+  try {
+    // Resolve from inside engine.io's own dependency closure so we pick up
+    // exactly the engine.io-parser the transport uses, not a duplicate copy.
+    WebSocketTransport = require('engine.io/build/transports/websocket').WebSocket;
+    encodePayload = require('engine.io-parser').encodePayload;
+  } catch (err: any) {
+    logger.warn(`Unable to install engine.io WS packing (modules not found): ${err && err.message || err}`);
+    return;
+  }
+  if (typeof WebSocketTransport !== 'function' ||
+      typeof WebSocketTransport.prototype !== 'object' ||
+      typeof encodePayload !== 'function') {
+    logger.warn('engine.io shape is unexpected; skipping WS packing patch');
+    return;
+  }
+
Evidence
The function sets installed = true and then can return early from the catch or from the shape
guard, leaving the patch uninstalled but preventing any future calls from doing work due to the `if
(installed) return;` guard.

src/node/utils/EnginePacking.ts[38-58]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
`installEngineWsPacking()` marks the patch as installed before confirming that it can actually install. If module resolution fails or the engine.io shape check fails, the function returns with `installed === true`, blocking any later attempt to install in the same process.

## Issue Context
This module is explicitly intended to be idempotent and safe to call from multiple boot paths. The current flow makes failures “sticky” within the process lifetime.

## Fix Focus Areas
- src/node/utils/EnginePacking.ts[38-58]

## Proposed fix
- Move `installed = true` to *after* all require/shape validation passes and immediately before/after the prototype is successfully wrapped.
- Optionally, use a second flag (e.g. `installing`) to avoid double-wrapping if re-entrancy is a concern, but do not permanently set `installed` on failure.

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


3. Original send not validated 🐞 Bug ☼ Reliability
Description
The patch captures WebSocketTransport.prototype.send without verifying it is a function, then
unconditionally calls originalSend.call(...) in the single-packet path, which will throw a
TypeError if send is missing or non-callable for the resolved transport export.
Code

src/node/utils/EnginePacking.ts[R60-70]

+
+  WebSocketTransport.prototype.send = function (packets: any[]) {
+    // Single-packet sends keep the legacy fast path: per-frame encoding
+    // including the pre-encoded-frame optimisation. Only fan-out bursts
+    // (writeBuffer accumulated more than one packet between flushes) are
+    // packed — for the steady state of a quiet pad, behaviour is identical
+    // to the upstream implementation.
+    if (!Array.isArray(packets) || packets.length < 2) {
+      return originalSend.call(this, packets);
+    }
+
Evidence
The code stores originalSend without a type check and then uses originalSend.call(...) when
packets.length < 2, which will throw if originalSend is not a function.

src/node/utils/EnginePacking.ts[60-70]

Agent prompt
The issue below was found during a code review. Follow the provided context and guidance below and implement a solution

## Issue description
The patch assumes `WebSocketTransport.prototype.send` exists and is callable. If the resolved `WebSocketTransport` export is not the expected engine.io transport (or if engine.io changes), `originalSend` may be `undefined` or non-function, and the single-packet fast path will crash when calling `originalSend.call(...)`.

## Issue Context
There is already a shape check for `WebSocketTransport` and `encodePayload`, but it does not check the presence/type of `WebSocketTransport.prototype.send`.

## Fix Focus Areas
- src/node/utils/EnginePacking.ts[60-70]

## Proposed fix
- Add `if (typeof originalSend !== 'function') { logger.warn(...); return; }` before overriding the prototype.
- If you keep the early `installed` flag, ensure it is not left `true` when you return due to this validation (ideally combined with the fix from the other finding).

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools


Grey Divider

Qodo Logo

Comment on lines +662 to +678
/**
* Pack multiple engine.io packets into a single WebSocket frame
* (ether/etherpad#7756 lever 8). engine.io's WS transport otherwise
* sends one frame per packet, while the polling transport already
* batches via encodePayload. Enabling this matches the polling
* coalescing behaviour on the WS path; at high fan-out rates it cuts
* the WS frame count proportionally to packets-per-flush.
*
* WARNING: enabling this requires connected clients to recognise
* payload-encoded WS frames (split on the `\x1e` record separator
* and decodePayload). Clients that pass each frame straight to
* decodePacket will fail to parse a multi-packet frame and silently
* miss those messages. The etherpad browser bundle and
* etherpad-cli-client are forward-compatible (they detect the
* separator); third-party clients may not be. Coordinate rollouts.
*/
enginePacking: false,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Action required

1. enginepacking undocumented in template 📘 Rule violation ⚙ Maintainability

A new public config setting enginePacking is added and wired into server boot, but the user-facing
configuration documentation/template is not updated to include it. This risks operators being
unaware of the flag and its compatibility warning.
Agent Prompt
## Issue description
A new configuration option (`enginePacking`) was introduced and is used at runtime, but it is not documented in the repository’s configuration documentation, so users will not discover it or its rollout warnings.

## Issue Context
`enginePacking` is added to `SettingsType` and defaulted to `false` with detailed inline comments, and it gates a runtime behavior change in socket.io/engine.io startup.

## Fix Focus Areas
- settings.json.template[709-740]
- src/node/utils/Settings.ts[662-678]

ⓘ Copy this prompt and use it to remediate the issue with your preferred AI generation tools

@JohnMcLear
Copy link
Copy Markdown
Member Author

Closing. Scored against the dive on run 25954316731: at step 350 (just below the cliff), engine-packing has p95=109ms and apply_mean=23.86ms vs baseline p95=76ms / apply_mean=16.15ms — neutral-to-slightly-worse, within noise. The patch is not the win we hoped for.

Why — engine.io's socket.flush() calls transport.send(writeBuffer) and resets the buffer to []. It fires as soon as transport.writable === true, and for WebSocket writable returns to true within microseconds of each write. So even at 10k+ packets/sec, the writeBuffer rarely accumulates more than one packet. My patch only activated when packets.length > 1, so it almost never triggered.

For a real transport-level packing win we'd need to deliberately delay flush — for example, by deferring it to the next setImmediate or microtask boundary so multiple sendPacket calls within one tick accumulate. That's an invasive change to engine.io's flush semantics, not a per-method monkey-patch. Worth investigating but a much bigger PR than this one was scoped to be.

Secondary finding — this same run showed websocket-only as the best lever, contradicting every previous dive. The matrix-jobs-on-different-physical-runners issue (each GHA matrix entry runs on a separate machine) means our cross-lever comparisons are cross-hardware. Runner noise can flip lever conclusions. Real dive methodology going forward needs N≥3 runs per lever to filter out runner noise before drawing strong conclusions. Worth flagging on the dive doc when next updated.

The harness-side forward-compat patch in ether/etherpad-load-test#106 (already merged) stays — it's cheap forward-compat in case a future server-side change uses payload-encoded frames intentionally.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant